{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Benchmark Creation\n",
    "\n",
    "Notebook to create the report file to export Benchmark data (to be released)\n",
    "\n",
    "**Note:** : this notebook assumes the use of **Python 3**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Preamble: Settings Django Environment"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load preamble_directives.py"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Benchmark Report (per Project)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from source_code_analysis.models import SoftwareProject"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "projects = SoftwareProject.objects.all()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The **Replication Dataset** contains the following report files:\n",
    "\n",
    "* (`Benchmark_Coherence_Data.txt`, `Benchmark_Raw_Data.txt`):\n",
    "    Report files containing information about the Coherence and the raw data of methods from all the 4 considered \n",
    "    Software Systems.\n",
    "* (`CoffeeMaker_Coherence_Data.txt`, `CoffeeMaker_Raw_Data.txt`):\n",
    "    Report files providing the Coherence and the raw data of methods gathered from the **CoffeeMaker** Software System.\n",
    "* (`JFreeChart060_Coherence_Data.txt`, `JFreeChart060_Raw_Data.txt`):\n",
    "    Report files providing the Coherence and the raw data of methods gathered from the **JFreeChart 0.6.0** Software System.\n",
    "* (`JFreeChart071_Coherence_Data.txt`, `JFreeChart071_Raw_Data.txt`):\n",
    "    Report files providing the Coherence and the raw data of methods gathered from the **JFreeChart 0.7.1** Software System.\n",
    "* (`JHotDraw741_Coherence_Data.txt`, `JHotDraw741_Raw_Data.txt`):\n",
    "    Report files providing the Coherence and the raw data of methods gathered from the **JHotDraw 7.4.1** Software System."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Coherence Data Report Structure\n",
    "\n",
    "Report files providing information about the *Coherence* of methods are structured according to the CSV \n",
    "(i.e., *Comma Separated Values*) format.\n",
    "Each line of the file contains the following information:\n",
    "\n",
    "    method_id, coherence\n",
    "    \n",
    "* `method_id`: the unique identifier of the corresponding method\n",
    "* `coherence` : the coherence value associated to the comment and the implementation of the referred method. \n",
    "    Allowed Coherence Values are: `NOT_COHERENT` and `COHERENT`. \n",
    "    In case, it would be more than straightforward to translate these values into `0`, `1` values, respectively."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write Coherence Report\n",
    "def write_coherence_report(coherence_report_filepath, target_methods):\n",
    "    with open(coherence_report_filepath, 'w') as coherence_report:\n",
    "        for method in target_methods:\n",
    "            evaluation = method.agreement_evaluations.all()[0]\n",
    "            coherence_value = 'COHERENT' if evaluation.agreement_vote in [3, 4] else 'NOT_COHERENT'\n",
    "            coherence_report.write('{0}, {1}\\n'.format(method.pk, coherence_value))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Raw Data Report Structure\n",
    "\n",
    "All the report files containing the raw data of the methods share exactly the **same** *multiline* structure. \n",
    "That is (*for each method*): \n",
    "\n",
    "    method_id, method_name, class_name, software_system\n",
    "    filepath, start_line, end_line, \n",
    "    Length of the Head Comments\n",
    "    Head Comment\n",
    "    Length of the Implementation\n",
    "    Method Implementation\n",
    "    ###"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write Raw Data Report\n",
    "def write_raw_data_report(raw_report_filepath, target_methods):\n",
    "    with open(raw_report_filepath, 'w') as raw_report:\n",
    "        for method in target_methods:\n",
    "            software_system_name = method.project.name + method.project.version.replace('.', '')\n",
    "            raw_report.write('{mid}, {method_name}, {class_name}, {software_system}\\n'.format(\n",
    "                                    mid=method.id, method_name=method.method_name, class_name=method.code_class.class_name,\n",
    "                                    software_system=software_system_name))\n",
    "            \n",
    "            method_fp = method.file_path\n",
    "            relative_filepath = method_fp[method_fp.find('extracted')+len('extracted')+1:]\n",
    "            raw_report.write('{filepath}, {start_line}, {end_line}\\n'.format(filepath=relative_filepath, \n",
    "                                                                             start_line=method.start_line, \n",
    "                                                                             end_line=method.end_line))\n",
    "            \n",
    "            raw_report.write('{comment_len}\\n'.format(comment_len=len(method.comment.splitlines())))\n",
    "            raw_report.write('{comment}'.format(comment=method.comment))\n",
    "            if not method.comment.endswith('\\n'):\n",
    "                raw_report.write('\\n')\n",
    "                \n",
    "            raw_report.write('{code_len}\\n'.format(code_len=len(method.code_fragment.splitlines())))\n",
    "            raw_report.write('{code}'.format(code=method.code_fragment))\n",
    "            if not method.code_fragment.endswith('\\n'):\n",
    "                raw_report.write('\\n')\n",
    "            \n",
    "            # Last Line of this method\n",
    "            raw_report.write('###\\n')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "RAW_DATA_SUFFIX = 'Raw_Data.txt'\n",
    "COHERENCE_DATA_SUFFIX = 'Coherence_Data.txt'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "# Create Report Folder\n",
    "report_folderpath = os.path.join(os.path.abspath(os.path.curdir), 'report_files')\n",
    "if not os.path.exists(report_folderpath):\n",
    "    os.makedirs(report_folderpath)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "all_methods_list = list()\n",
    "\n",
    "# Project-Specific Reports\n",
    "\n",
    "for project in projects:\n",
    "    software_system_name = project.name + project.version.replace('.', '')\n",
    "    target_methods = list()\n",
    "    project_methods = project.code_methods.order_by('pk')\n",
    "    # Collect Project Methods whose evaluations are Coherent|Not Coherent\n",
    "    for method in project_methods:\n",
    "        evaluation = method.agreement_evaluations.all()[0]\n",
    "        if not evaluation.wrong_association and evaluation.agreement_vote != 2:\n",
    "            target_methods.append(method)\n",
    "            \n",
    "    all_methods_list.extend(target_methods)\n",
    "    \n",
    "    # Coherence Data Report\n",
    "    coherence_report_filename = '{0}_{1}'.format(software_system_name, COHERENCE_DATA_SUFFIX)\n",
    "    coherence_report_filepath = os.path.join(report_folderpath, coherence_report_filename)\n",
    "    \n",
    "    write_coherence_report(coherence_report_filepath, target_methods)\n",
    "    \n",
    "    # Raw Data Report\n",
    "    raw_report_filename = '{0}_{1}'.format(software_system_name, RAW_DATA_SUFFIX)\n",
    "    raw_report_filepath = os.path.join(report_folderpath, raw_report_filename)\n",
    "    \n",
    "    write_raw_data_report(raw_report_filepath, target_methods)\n",
    "    \n",
    "\n",
    "# -- Entire Benchmark Reports\n",
    "\n",
    "# Coherence Data Report\n",
    "coherence_report_filename = '{0}_{1}'.format('Benchmark', COHERENCE_DATA_SUFFIX)\n",
    "coherence_report_filepath = os.path.join(report_folderpath, coherence_report_filename)\n",
    "\n",
    "write_coherence_report(coherence_report_filepath, all_methods_list)\n",
    "\n",
    "# Raw Data Report\n",
    "raw_report_filename = '{0}_{1}'.format('Benchmark', RAW_DATA_SUFFIX)\n",
    "raw_report_filepath = os.path.join(report_folderpath, raw_report_filename)\n",
    "\n",
    "write_raw_data_report(raw_report_filepath, all_methods_list)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    ""
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3.0
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.4.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}